30 research outputs found

    Relevance of ASR for the Automatic Generation of Keywords Suggestions for TV programs

    Semantic access to multimedia content in audiovisual archives depends to a large extent on the quantity and quality of the metadata, and particularly on the content descriptions attached to the individual items. However, given the growing amount of material created on a daily basis and the digitization of existing analogue collections, traditional manual annotation of collections puts heavy demands on resources, especially for large audiovisual archives. One way to address this challenge is to introduce (semi-)automatic annotation techniques for generating and/or enhancing metadata. The NWO-funded CATCH-CHOICE project has investigated the extraction of keywords from textual resources related to the TV programs to be archived (context documents), in collaboration with the Dutch audiovisual archive Sound and Vision. Besides the program descriptions published by broadcasters on their websites, Automatic Speech Recognition (ASR) transcripts from the CATCH-CHoral project also provide textual resources that might be relevant for suggesting keywords. This paper investigates the suitability of ASR for generating such keywords, which we evaluate against manual annotations of the documents and against keywords automatically generated from the context documents.
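
    The paper reports an evaluation rather than a specific algorithm, but the basic setup can be illustrated with a small sketch: rank the terms of an ASR transcript by TF-IDF against a background of context documents and compare the suggested keywords with manual annotations. This is a minimal, hypothetical Python sketch, not the CHOICE pipeline; the toy transcript, context documents and manual keywords are invented for illustration, and no stopword filtering is applied.

        import math
        import re
        from collections import Counter

        def tokenize(text):
            return re.findall(r"[a-z]+", text.lower())

        def tfidf_keywords(doc, background, top_n=5):
            # Rank terms of `doc` by TF-IDF; `background` supplies the IDF statistics.
            docs = [tokenize(d) for d in background] + [tokenize(doc)]
            n_docs = len(docs)
            df = Counter(term for d in docs for term in set(d))
            tf = Counter(docs[-1])
            scores = {t: (c / len(docs[-1])) * math.log(n_docs / df[t]) for t, c in tf.items()}
            return [t for t, _ in sorted(scores.items(), key=lambda kv: -kv[1])[:top_n]]

        def precision_recall(suggested, manual):
            # Overlap-based comparison of suggested keywords with manual annotations.
            hits = len(set(suggested) & set(manual))
            return hits / len(suggested), hits / len(manual)

        # Toy data, invented for illustration only.
        asr_transcript = ("the orchestra performed the ninth symphony by beethoven "
                          "before a full concert hall")
        context_docs = [
            "the broadcaster announced tonight's football match",
            "a news bulletin about the national election",
        ]
        manual_keywords = ["beethoven", "symphony", "orchestra"]
        suggested = tfidf_keywords(asr_transcript, context_docs)
        print(suggested)
        print(precision_recall(suggested, manual_keywords))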

    Using XSLT for interoperability: DOE and the travelling domain experiment

    No abstract available.

    Results of the Ontology Alignment Evaluation Initiative 2009

    Ontology matching consists of finding correspondences between ontology entities. OAEI campaigns aim at comparing ontology matching systems on precisely defined test cases. Test cases can use ontologies of different natures (from expressive OWL ontologies to simple directories) and different evaluation modalities, e.g., blind evaluation, open evaluation, or consensus. OAEI-2009 builds on previous campaigns, with 5 tracks and 11 test cases followed by 16 participants. This paper is an overall presentation of the OAEI 2009 campaign.
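
    OAEI campaigns score each submitted alignment against a reference alignment, typically with precision, recall and F-measure. The sketch below illustrates that scoring step in Python on toy data; alignments are reduced here to sets of entity pairs, whereas real OAEI alignments use the RDF Alignment format and also carry relations and confidence values, and the entity names are invented.

        def evaluate(found: set, reference: set):
            # Precision/recall/F-measure of a produced alignment against a reference.
            correct = found & reference
            precision = len(correct) / len(found) if found else 0.0
            recall = len(correct) / len(reference) if reference else 0.0
            f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
            return precision, recall, f1

        # Toy alignments between two hypothetical ontologies.
        reference = {("ont1#Author", "ont2#Writer"), ("ont1#Paper", "ont2#Article")}
        found = {("ont1#Author", "ont2#Writer"), ("ont1#Paper", "ont2#Journal")}
        print(evaluate(found, reference))  # -> (0.5, 0.5, 0.5)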

    Méthodologie linguistique et terminologique pour la structuration d'ontologies différentielles à partir de corpus textuels

    Resources such as terminologies and ontologies are used in a number of applications, including documentary description and information retrieval. Different methodologies have been proposed to build such resources, based either on interviews with domain experts or on textual corpora. This thesis focuses on the use of existing Natural Language Processing methodologies, intended to support the construction of ontologies from textual corpora, to build a particular type of resource: differential ontologies. These ontologies are structured according to a system of semantic identities and differences between their constituents: domain terms and categorisation items called "top-level categories". We present the experiments we carried out to elicit, structure, define and interdefine the terminological items relevant to a given task. Our first use case was the OPALES project, in which we had to provide a group of anthropologists with the conceptual vocabulary they needed to annotate audiovisual documents about early childhood. We used the textual corpus built in that project to test linguistic tools and methodologies for building ontologies from textual data, and we defined our own processing chain. This chain, called SODA, is based on the extraction and exploitation of defining contexts in corpora to identify terminological items, structure them, and provide semantic similarity information that makes it possible to compare them.
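
    One step described above, spotting defining contexts in a corpus, can be illustrated with a small pattern-based sketch. SODA itself is not reproduced here; the lexical patterns ("is a kind of", "is defined as") and the toy sentences below are illustrative assumptions, not the thesis's actual pattern set.

        import re

        # Hypothetical defining-context patterns; real systems use richer lexico-syntactic cues.
        PATTERNS = [
            re.compile(r"(?P<term>[A-Za-z ]+?) is a kind of (?P<genus>[A-Za-z ]+)"),
            re.compile(r"(?P<term>[A-Za-z ]+?) is defined as (?P<genus>[A-Za-z ]+)"),
        ]

        def defining_contexts(sentences):
            # Yield (term, genus, sentence) triples for sentences matching a pattern.
            for sentence in sentences:
                for pattern in PATTERNS:
                    match = pattern.search(sentence)
                    if match:
                        yield match.group("term").strip(), match.group("genus").strip(), sentence
                        break

        # Toy corpus, invented for illustration.
        corpus = [
            "A lullaby is a kind of song used to soothe young children",
            "Early childhood is defined as the period from birth to eight years",
        ]
        for term, genus, _ in defining_contexts(corpus):
            print(term, "->", genus)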

    A common multimedia annotation framework for cross linking cultural heritage digital collections

    In the context of the CATCH research program, currently carried out at a number of large Dutch cultural heritage institutions, our ambition is to combine and exchange heterogeneous multimedia annotations between projects and institutions. As a first step we designed an Annotation Meta Model (AMM): a simple but powerful RDF/OWL model mainly addressing the anchoring of annotations to segments of the many different media types used in the collections of the archives, museums and libraries involved. The model includes support for annotating annotations themselves, and segments of annotation values, so that annotations can be layered and projects can process each other's annotation data as the primary data for further annotation. On the basis of AMM we designed an application programming interface for accessing annotation repositories and implemented it both as a software library and as a web service. Finally, we report on our experiences with the application of the model, API and repository when developing web applications for collection managers in cultural heritage institutions.
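
    The abstract does not reproduce the AMM vocabulary, but the core idea, anchoring an annotation to a segment of a media item in RDF, can be sketched with rdflib in Python. The namespace and property names below (amm:MediaSegment, amm:hasTarget, amm:start, ...) are hypothetical placeholders rather than the project's actual schema.

        from rdflib import Graph, Literal, Namespace, URIRef
        from rdflib.namespace import RDF, XSD

        AMM = Namespace("http://example.org/amm#")   # placeholder namespace, not the real AMM
        EX = Namespace("http://example.org/data/")

        g = Graph()
        g.bind("amm", AMM)
        g.bind("ex", EX)

        segment = EX["video42/segment1"]
        annotation = EX["annotation1"]

        # A temporal segment of a media item, identified by start/end offsets.
        g.add((segment, RDF.type, AMM.MediaSegment))
        g.add((segment, AMM.isSegmentOf, URIRef("http://example.org/data/video42")))
        g.add((segment, AMM.start, Literal("00:01:30", datatype=XSD.string)))
        g.add((segment, AMM.end, Literal("00:02:10", datatype=XSD.string)))

        # An annotation anchored to that segment; another annotation could in turn
        # take this one as its target, giving the layering the abstract mentions.
        g.add((annotation, RDF.type, AMM.Annotation))
        g.add((annotation, AMM.hasTarget, segment))
        g.add((annotation, AMM.hasBody, Literal("interview with the composer")))

        print(g.serialize(format="turtle"))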